- Request for Comments: 713 Jack Haverty (JFH@MIT-DMS)
- NIC #34739 Apr 1976
-
-
-
-
-
-
-
- I. ABSTRACT
-
-
- A mechanism is defined for use by message servers in
- transferring data between hosts. The mechanism, called the
- MSDTP, is defined in terms of a model of the process as a
- translation between two sets of items, the abstract entities
- such as 'strings' and 'integers', and the formats used to
- represent such data as a byte stream.
-
- A proposed organization of a general data transfer
- mechanism is described, and the manner in which the MSDTP
- would be used in that environment is presented.
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
- -1-
-
- II. REFERENCES
-
-
- Black, Edward H., "The DMS Message Composer", MIT Project
- MAC, Programming Technology Division Document
- SYS.16.02.
-
- Burchfiel, Jerry D., Leavitt, Elsie M., Shapiro, Sonya and
- Strollo, Theodore R., compilers, "Tenex Users' Guide",
- Bolt Beranek and Newman, Cambridge, Mass., May 1971,
- revised January 1975, Descriptive sections on the TENEX
- subsystems: MAILER, p. 116-11; MAILSTAT, p. 118-119;
- READMAIL, p. 137; and SNDMSG, p. 165-170.
-
- Haverty, Jack, "Communications System Overview", MIT Project
- MAC, Programming Technology Division Document
- SYS.16.00.
-
- Haverty, Jack, "Communications System Daemon Manual", MIT
- Project MAC, Programming Technology Division Document
- SYS.16.01.
-
- ISI Information Automation Project, "Military Message
- Processing System Design," Internal Project
- Documentation (Out of Print), Jan. 1975
-
- Message Services Committee, "Interim Report", Jan. 28, 1975
-
- Mooers, Charlotte D., "Mailsys Message System: Manual For
- Users", Bolt Beranek and Newman, Cambridge, Mass., June
- 1975 (draft).
-
- Myer, Theodore H., "Notes On The BBN Mail System", Bolt
- Beranek and Newman, November 8, 1974.
-
- Myer, Theodore H., and Henderson, D. Austin, "Message
- Transmission Protocol", Network Working Group RFC 680,
- NIC 32116, April 30, 1975.
-
- Postel, Jon, "The PCPB8 Format", NSW Proposal, June 5, 1975
-
- Tugender, R., and D. R. Oestreicher, "Basic Functional
- Capabilities for a Military Message Processing
- Service," ISI/RR-74-23, May 1975
-
- Vezza, Al, "Message Services Committee Minority Report",
- Jan. 1975
-
-
-
-
-
-
-
-
-
-
- -2-
-
- III. OVERVIEW
-
-
- This document describes a mechanism developed for use
- by message servers communicating over an eight-bit
- byte-oriented network connection to move data structures and
- associated data-typing information. It is presented here in
- the hope that it may be of use to other projects which need
- to transfer data structures between dissimilar hosts.
-
- A set of abstract entities called PRIMITIVE ITEMS is
- enumerated. These are intended to include traditional data
- types of general utility, such as integers, strings, and
- arrays.
-
- A mechanism is defined for augmenting the set of
- abstract data entities handled, to allow the introduction of
- application-specific data, whose format and semantics are
- understood by the application programs involved, but which
- can be transmitted using common coding facilities. An
- example might be a data structure called a 'file
- specification', or a 'date'. Abstract data entities defined
- using this mechanism will be termed SEMANTIC ITEMS, since
- they are typically used to carry data having semantic
- content in the application involved.
-
- Semantic and primitive items are collectively referred
- to simply as ITEMS.
-
- The protocol next involves the definition of the format
- of the byte stream used to convey items from machine to
- machine. These encodings are described in terms of OBJECTS,
- which are the physical byte streams transmitted.
-
- To complete the protocol, the rules for translating
- between objects and items are presented as each object is
- defined.
-
- An item is transmitted by being translated into an
- object which is transmitted over the connection as a stream
- of bytes to the receiver, and reconstructed there as an
- item. The protocol mechanism may thus be viewed as a simple
- translator. It enumerates a set of abstract entities, the
- items, which are known to programmers, a set of entities in
- byte-stream format, the objects, and the translation rules
- for conversion between the sets. A site implementing the
- MSDTP would typically provide a facility to convert between
- objects and the local representation of the various items
- handled. Applications using the MSDTP define their
- interactions using items, without regard to the actual
- formats in which such items are represented at various
- machines. This permits programs to handle higher-level
- concepts such as a character string, without concern for its
- numerous representational formats. Such detail is handled
- by the MSDTP.
-
- -3-
-
-
- Finally, a discussion of a general data transfer
- mechanism for communication between programs is presented,
- and the manner in which the particular byte-oriented
- protocol defined herein would be used in that environment is
- discussed.
-
- Terminology, as introduced, is defined and highlighted
- by capitalizing.
-
-
- IV. PRIMITIVE DATA ITEMS
-
- The primitive data items include a variety of
- traditional, well-understood types, such as integers and
- strings. Primitive data items will be presented using
- mnemonic names preceded by the character pair "p-", to serve
- as a reminder that the named object is primitive.
-
- These items may be represented in various computer
- systems in whatever fashion their programmers desire.
-
-
- IV.1 -- Set Of Primitive Items
-
-
- The set of primitive items defined includes p-INT,
- p-STRING, p-STRUC, p-BITS, p-CHAR, p-BOOL, p-EMPTY, and
- p-XTRA.
-
- Because the protocol was developed primarily for use in
- message services, items such as p-FLOAT were unnecessary and
- are not included. Additional items may easily be added as
- necessary.
-
- A p-INT performs the traditional role of representing
- integer numbers. A p-BITS (BIT Stream) item represents a
- bit stream. The two possible p-BOOL (BOOLean) items are
- used to represent the logical values of *TRUE* and *FALSE*.
- The single p-EMPTY item is used, for example, to indicate
- that a given field of a message is empty. It is provided to
- act as a place-holder, representing 'no data', and appears
- as *EMPTY*.
-
- The p-STRUC (STRUCture) item is used to group together
- a collection of items as a single value, maintaining the
- ordering of the elements, such as a p-STRUC of p-INTs.
-
- A p-CHAR is a single character. The most common
- occurrence of character data, however, will be as p-STRINGs.
- A p-STRING should be considered to be a synonym for a
- p-STRUC containing only p-CHARs. This concept is important
- for generality and consistency, especially when considering
- definitions of permissible operations on structures, such as
- extracting subsequences of elements, etc.
-
- -4-
-
- Four p-XTRA items, which can be transmitted in a single
- byte, are made available for higher level protocols to use
- when a frequently used datum is handled which can be
- represented just by its name. An example would be an
- acknowledgment between two servers. Using p-XTRAs to
- represent such data permits them to be handled in a single
- byte. There are four possible p-XTRA items, termed *XTRA0*,
- *XTRA1*, *XTRA2*, and *XTRA3*. These may be assigned
- meanings by user protocols as desired.
-
-
- IV.2 -- Printing Conventions
-
-
- The following printing conventions are introduced to
- facilitate discussion of the primitive items.
-
- When a specific instance of a primitive data item is
- presented, it will be shown in a traditional representation
- for that kind of data. For example, p-INTs are shown as
- sequences of digits, e.g. 100, and p-STRINGs as sequences of
- characters enclosed in double-quote characters, for example
- "ABCDEF".
-
- As shown above, the two possible p-BOOL items are shown
- as *TRUE* or *FALSE*. The object p-EMPTY appears as
- *EMPTY*. A bit stream, i.e. p-BITS, appears as a stream of
- 1s and 0s enclosed in asterisks, for example *100101001*. A
- p-CHAR will be presented as the character enclosed in single
- quote characters, e.g., 'A'.
-
- P-STRUCs are printed as the representations of their
- elements, enclosed in parentheses, for example (1 2 3 4) or
- ("XYZ" "ABC" 1 2) or ((1 2 3) "A" "B"). Note that because
- p-STRINGs are simply a class of p-STRUCs assigned a special
- name and printing format for brevity and convenience, the
- items "ABC" and ('A' 'B' 'C') are identical, and the latter
- format should not be used.
-
- To present a generic p-STRUC, as in specifying formats
- of the contents of something, the items are presented as a
- mnemonic name, optionally followed by a colon and the
- permissible types of values for that datum. When one of
- several items may appear as the value for some component,
- the permissible ones appear separated by vertical-bar
- characters. For example, p-INT|p-STRING represents a single
- item, which may be either a p-INT or a p-STRING.
-
- To represent a succession of items, the Kleene star
- convention is used. The specification p-INT[*] represents
- any number of p-INTs. Similarly, p-INT[3,5] represents from
- 3 to 5 p-INTs, while p-INT[*,5] specifies up to 5 and
- p-INT[5,*] specifies at least 5 p-INTs.
-
-
-
- -5-
-
- For example, a p-STRUC which is used to carry names and
- numbers might be specified as follows.
-
- (name:p-STRING number:p-INT)
-
- In discussing items in general, when a specific data
- value is not intended, the name and types representation may
- be used, e.g., offset:p-INT to discuss an 'offset' which has
- a numeric value.
-
-
- V. SEMANTIC ITEM MECHANISM
-
-
- The semantic item mechanism provides a means for
- program designers to use a variety of application-specific
- data items.
-
- This mechanism is implemented using a special tagged
- structure to carry the data type information as well as the
- actual components of the particular semantic item. For
- discussion purposes, such a special p-STRUC will be termed a
- p-EDT (Extended Data Type).
-
- When p-EDTs are transferred, their identity as a p-EDT
- is maintained, so that an application program receives the
- corresponding semantic item instead of a simple p-STRUC. A
- p-EDT is identical to a p-STRUC in all other respects.
-
-
- V.1 -- Format of p-EDTs
-
-
- A prototypical p-EDT follows. It is printed as if it
- were a normal p-STRUC. Since p-EDTs are converted to
- semantic items for presentation to the user, a p-EDT will
- never be used except in this protocol definition.
-
- (type:p-INT|p-STRING version:p-INT com1:any
- com2:any ...)
-
- The first element, the 'type', is generally a p-INT, and
- is used to identify the particular type of semantic item.
- Types are assigned numeric codes in a controlled fashion.
- The type may alternatively be specified by a p-STRING, to
- permit development of new data types for possible later
- assignment of codes. Each type has an equivalent p-STRING
- name. These may be used interchangeably as 'type' elements,
- primarily to maintain upward compatibility.
-
- The second element of a p-EDT is always a p-INT, the
- 'version', and specifies the exact format of the particular
- datum. A semantic item may undergo several revisions of its
- internal structure, which would be evident through assigning
- different versions to each revision.
-
- -6-
-
- Successive components, the 'com' elements, if any,
- carry the actual data of the semantic item. As each
- semantic item is defined, conventions on permissible values
- and interpretation of these components are presented. Such
- definitions may use any types of items to specify the format
- of the semantic item. Use of lower level concepts, such as
- objects, in these definitions is prohibited.
-
- Semantic items will be printed as the mnemonic for the
- type involved, preceded by the character pair "s-", to
- signify that the data item is handled by this mechanism.
-
-
- V.2 -- Printing Conventions
-
-
- A semantic item is represented as if it were a p-STRUC
- containing only the components, if any, but preceded by a #
- character and the semantic type name. The version number is
- assumed to be 1 if unspecified. For later versions, the
- version number is attached to the type name, as in, for
- example, FILE-2 to represent version 2 of the FILE data
- type.
-
- For example, a semantic item called a 'file
- specification' might be defined, containing two components,
- a host number and pathname. A specific instance of such an
- item might appear as #FILE(69 "DIRECTORY.NAME-OF-FILE"),
- while a generic s-FILE might be presented as the following.
-
- #FILE(host:p-INT|p-STRING pathname:p-STRING)
-
- In this example, 'host' is the first component of
- the item, which may be either a p-INT or p-STRING, and
- 'pathname' is the second component, which must be a
- p-STRING. The full definition would present interpretation
- rules for these components.
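
The printing conventions of sections IV.2 and V.2 can be sketched in Python as follows. This is only an illustration: the mapping of Python values to primitive items (None for *EMPTY*, bool for p-BOOL, int for p-INT, str for p-STRING, list for p-STRUC) and the names format_item and format_semantic are assumptions made for this example and are not part of the protocol. p-CHAR and p-BITS are left out to keep the sketch short.

```python
def format_item(item):
    """Render a value using the printing conventions of section IV.2."""
    if item is None:
        return "*EMPTY*"
    if isinstance(item, bool):  # test bool before int: bool is an int subclass
        return "*TRUE*" if item else "*FALSE*"
    if isinstance(item, int):
        return str(item)
    if isinstance(item, str):
        return '"' + item + '"'
    if isinstance(item, (list, tuple)):
        return "(" + " ".join(format_item(x) for x in item) + ")"
    raise TypeError("no printing convention assumed for %r" % (item,))


def format_semantic(type_name, components, version=1):
    """Render a semantic item as #NAME(...) per section V.2.

    Version 1 is left implicit; later versions are attached to the
    type name, as in FILE-2.
    """
    name = type_name if version == 1 else "%s-%d" % (type_name, version)
    return "#" + name + format_item(list(components))


print(format_semantic("FILE", [69, "DIRECTORY.NAME-OF-FILE"]))
print(format_semantic("FILE", [69, "DIRECTORY.NAME-OF-FILE"], version=2))
```

The first call reproduces the s-FILE instance shown above; the second shows how a later version would be attached to the type name.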
-
-
- VI. ENCODING OBJECTS
-
-
- This section presents the set of objects which are used
- to represent items as byte streams for inter-server
- transmission. Objects will be presented using mnemonic
- type-names preceded by the character pair "b-", indicating
- their existence only as byte streams.
-
- All servers are required to be capable of decoding the
- entire set of objects. Servers are not required to transmit
- certain objects which are available to improve channel
- efficiency.
-
-
-
-
- -7-
-
- The encodings are designed to facilitate programming
- and efficiency of the receiving decoder. In all cases, the
- type and length in bytes of objects are supplied as the first
- information sent. This characteristic is important for ease
- of implementation. The type information permits a decoder to
- be constructed in a modular fashion. The most important
- advantage of including size information is that the receiver
- always knows how many bytes it must read to discover what to
- do next, and knows when each object terminates. This
- requirement avoids many potential problems with timing and
- synchronization of processes.
-
- Two varieties of objects are defined. The first will
- be called ATOMIC, and includes objects used to efficiently
- encode the most common data. The second variety is termed
- NON-ATOMIC, and is used to encode larger or less common
- items.
-
- In all cases, a data object begins with a single byte,
- which will be termed the TYPE-BYTE, a field of which
- contains the type code of the object. The following bytes,
- if any, are interpreted according to the type involved.
-
-
- VI.1 -- Presentation Conventions
-
-
- In discussing formats of bytes, the following
- conventions will be employed. The individual bits of a byte
- will be referenced by using capital letters from A to H,
- where A signifies the highest order bit, and H the lowest.
- The entire eight bit value, for example, could be referred
- to as ABCDEFGH. Similarly, subfields of the byte will be
- identified by such sequences. The CDEF field specifies the
- middle four bits of a byte.
-
- In referring to values of fields, binary format will be
- used, and small letters near the end of the alphabet will be
- used to identify particular bits for discussion. For
- example, we might say that the BCD field of a byte contains
- a specifier for some type, and define its value to be
- BCD=11z. In discussions of the specifier usage, we could
- refer to the cases where z=1 and where z=0, as shorthand
- notation to identify BCD=111 and BCD=110, respectively.
-
-
- VI.2 -- Type-Byte Bit Assignment
-
-
- To assist in understanding the assignment of the
- various type-byte values, the table and graph below are
- included, showing representations of the eight bits.
-
-
-
-
- -8-
-
- 0XXXXXXX -- CHAR7 (CHARacter, 7 bit)
- 10XXXXXX -- SINTEGER (Small INTEGER)
- 110XXXXX -- NON-ATOM (NON-ATOMic objects)
- 11100XXX -- LINTEGER (Large INTEGER)
- 11101XXX -- reserved
- 11110XXX -- SBITSTR (Short BIT STReam)
- 111110XX -- XTRA (eXTRA single-byte objects)
- 1111110X -- BOOL (BOOLean)
- 11111110 -- EMPTY (EMPTY data item)
- 11111111 -- PADDING (unused byte)
-
-
- In each case, the bits identified by X's are used to
- contain information specific to the type involved. These
- are explained when each type is defined.
-
- An equivalent tree representation follows, for those
- who prefer it.
-   start with high order bit
-   |
-   |
-   |
-   0-----0-----0-----0-----0-----0-----0-----0-----X
-   |     |     |     |     |     |     |     |  PADDING
-  0|    0|    0|    0|    0|    0|    0|    0|
-   |     |     |     |     |     |     |     |
-   X     |     X     |     X     |     X     X
- CHAR7   | NON-ATOM  |    BITS   |    BOOL EMPTY
-  (7)    |    (5)    |    (3)    |    (1)
-         |          0|           |
-     SINTEGER        |         XTRA
-        (6)          |          (2)
-                 LINTEGER
-                    (3)
-
- Type-Byte Bit Assignment Scheme
-
-
-
-
- This picture is interpreted by entering at the top, and
- taking the appropriate branch at each node to correspond to
- the next bit of the type-byte, as it is scanned from left to
- right. When a type is assigned, the branch terminates with
- an 'X' and the name of the type of the object, with the
- number of remaining bits in parentheses. The individual
- object definitions specify how these bits are used for that
- particular type.
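
As an illustration of how a decoder might walk the assignment scheme above, the following Python sketch matches a type-byte against the bit prefixes listed in the table. The names TYPE_PREFIXES and classify_type_byte, and the choice to return a (name, payload value, payload width) tuple, are assumptions of this sketch rather than anything the protocol specifies.

```python
# Bit prefixes copied from the type-byte table, highest-order bit first.
TYPE_PREFIXES = [
    ("0", "CHAR7"),
    ("10", "SINTEGER"),
    ("110", "NON-ATOM"),
    ("11100", "LINTEGER"),
    ("11101", "reserved"),
    ("11110", "SBITSTR"),
    ("111110", "XTRA"),
    ("1111110", "BOOL"),
    ("11111110", "EMPTY"),
    ("11111111", "PADDING"),
]


def classify_type_byte(byte):
    """Return (type name, value of the X bits, number of X bits)."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("a type-byte must be in the range 0..255")
    bits = format(byte, "08b")
    for prefix, name in TYPE_PREFIXES:
        if bits.startswith(prefix):
            payload = bits[len(prefix):]
            value = int(payload, 2) if payload else 0
            return (name, value, len(payload))
    raise AssertionError("unreachable: the prefixes cover every byte value")


print(classify_type_byte(0b10001010))  # a b-SINTEGER carrying 10
print(classify_type_byte(0b11000001))  # a non-atomic object of type 1
```

Because the prefixes are mutually exclusive and cover all 256 values, the order of the list does not affect the result.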
-
-
- VI.3 -- Atomic Objects
-
-
- Atomic objects are identified by specific patterns in a
- type-byte. Receiving servers must be capable of recognizing
-
-
- -9-
-
- and handling all atomic types, since the size of the object
- is not explicitly present in a uniform fashion.
-
-
- ================================
- | Atomic Object: B-CHAR7 |
- ================================
-
-
- The b-CHAR7 (CHARacter 7 bit) object is introduced to
- handle transmission of characters, in 7-bit ASCII format.
- Since the vast majority of message-related data involves
- such objects, they are designed to be very efficient in
- transmission. Other formats, such as eight bit values, can
- be introduced as non-atomic objects. The format of a b-CHAR7
- follows:
-
- A=0 identifying the b-CHAR7 data type
- BCDEFGH=tuvwxyz containing the character
- code
-
- The tuvwxyz bits contain the ASCII code of the
- character. For example, transmission of a 'space' (ASCII
- code 32, 40 octal) would be accomplished by the following
- byte.
-
- 00100000
- ABCDEFGH
-
- A=0 to identify this byte as a b-CHAR7. The remaining
- bits contain the 7 bit code, octal 40, for space.
-
- A b-CHAR7 standing alone is presented as a p-CHAR.
- Such occurrences will probably be rare if they are used at
- all. The most common use of b-CHAR7's is as elements of
- b-USTRUCs used to transmit p-STRINGs, as explained later.
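
A minimal Python sketch of the b-CHAR7 translation follows. The function names encode_char7 and decode_char7 are hypothetical, and rejecting characters outside 7-bit ASCII with an exception is a choice made for this example.

```python
def encode_char7(ch):
    """Encode a single 7-bit ASCII character as a one-byte b-CHAR7."""
    code = ord(ch)  # raises TypeError unless ch is exactly one character
    if code > 0x7F:
        raise ValueError("b-CHAR7 carries 7-bit codes only")
    return bytes([code])  # bit A is 0, bits BCDEFGH hold the code


def decode_char7(byte):
    """Translate a b-CHAR7 type-byte back into a character."""
    if byte & 0b10000000:
        raise ValueError("bit A must be 0 for a b-CHAR7")
    return chr(byte & 0b01111111)


print(encode_char7(" "))  # the single byte 00100000 from the example above
```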
-
-
- =============================
- | Atomic Object: B-SINTEGER |
- =============================
-
- The b-SINTEGER (Small INTEGER) object is used to
- transmit very small non-negative integers, of values up to 63.
- It always translates to a p-INT, and any p-INT between 0
- and 63 may be encoded as a b-SINTEGER for transmission. The
- format of a b-SINTEGER follows.
-
- AB=10 identifying the object as a b-SINTEGER
- CDEFGH=uvwxyz containing the actual number
-
- For example, to transmit the integer 10 (12 octal), the
- following byte would be transmitted:
-
- 10001010
- ABCDEFGH
-
- -10-
-
- AB=10 to specify a b-SINTEGER. The remaining six bits
- contain the number 10 expressed in binary.
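
The b-SINTEGER format can be sketched in Python as below. The function names are assumptions of this example; the bit layout follows the format given above.

```python
def encode_sinteger(n):
    """Encode a p-INT in the range 0..63 as a one-byte b-SINTEGER."""
    if not 0 <= n <= 63:
        raise ValueError("b-SINTEGER holds values 0..63 only")
    return bytes([0b10000000 | n])  # AB=10, CDEFGH=value


def decode_sinteger(byte):
    """Extract the six-bit value from a b-SINTEGER type-byte."""
    if byte & 0b11000000 != 0b10000000:
        raise ValueError("bits AB must be 10 for a b-SINTEGER")
    return byte & 0b00111111


print(format(encode_sinteger(10)[0], "08b"))
```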
-
- =============================
- | Atomic Object: B-LINTEGER |
- =============================
-
- The b-LINTEGER (Large INTEGER) object is used to
- transmit p-INTs to any precision up to 64 bits. It is
- always translated as a p-INT. Sending servers are permitted
- to choose either b-LINTEGER or b-SINTEGER format for
- transmission of numbers, as appropriate. When possible,
- b-SINTEGERs can be used for better channel efficiency. The
- format of a b-LINTEGER follows:
-
- ABCDE=11100 specifying that this is a b-LINTEGER.
- FGH=xyz containing a count of number of bytes to follow.
-
- The xyz bits are interpreted as a number of bytes to
- follow which contain the actual binary code of the
- integer in 2's complement format. Since a zero-byte integer
- is disallowed, the pattern xyz=000 is interpreted as 1000,
- specifying that 8 bytes follow. The number is transmitted
- with high-order bits first. This format permits
- transmission of integers as large as 64 bits in magnitude.
-
- For example, if the number 4096 (10000 octal) is to be
- transmitted, the following sequence of bytes would be sent:
-
- 11100010 00010000 00000000
- ABCDEFGH ---actual data---
-
- ABCDE=11100, identifying this as a b-LINTEGER, E=0,
- specifying a positive number, and FGH=010, specifying that 2
- bytes follow, containing the actual binary number.
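
The following Python sketch shows one way the large-integer format might be encoded and decoded. The function names are hypothetical, and choosing the smallest byte count that holds the value is an assumption of this sketch; the format itself only requires between one and eight data bytes in 2's complement, high-order bits first.

```python
def encode_linteger(n):
    """Encode a p-INT as a type-byte plus 1 to 8 two's complement bytes."""
    if not -(1 << 63) <= n < (1 << 63):
        raise ValueError("value does not fit in 64 bits")
    length = 1
    while not -(1 << (8 * length - 1)) <= n < (1 << (8 * length - 1)):
        length += 1
    count_field = length & 0b111  # a length of 8 is sent as xyz=000
    return bytes([0b11100000 | count_field]) + n.to_bytes(length, "big", signed=True)


def decode_linteger(data):
    """Decode an integer object that starts at data[0]."""
    type_byte = data[0]
    if type_byte & 0b11111000 != 0b11100000:
        raise ValueError("bits ABCDE must be 11100")
    length = (type_byte & 0b111) or 8  # xyz=000 means 8 bytes follow
    payload = data[1:1 + length]
    if len(payload) != length:
        raise ValueError("truncated integer object")
    return int.from_bytes(payload, "big", signed=True)


print(" ".join(format(b, "08b") for b in encode_linteger(4096)))
```

For 4096 this yields the three bytes of the example above.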
-
- ============================
- | Atomic Object: B-SBITSTR |
- ============================
-
- The b-SBITSTR (Short BIT STReam) object is used to
- transmit a p-BITS of length 63 or less. For longer bit
- streams, the non-atomic object b-LBITSTR may be used. The
- format of a b-SBITSTR follows.
-
- ABCDE=11110 specifying the type as b-SBITSTR
- FGH=xyz specifying the number of bytes
- following.
-
-
-
-
-
-
-
- -11-
- The xyz value specifies the number of additional bytes
- to be read to obtain the bit stream values. As in the case
- of b-LINTEGER, the value xyz=000 is interpreted as 1000,
- specifying that 8 bytes follow.
-
- To avoid requiring specification of exactly the number
- of bits contained, the following convention is used. The
- first data byte is scanned from left to right until the
- first 1 bit is encountered. The bit stream is defined to
- begin with the immediately following bit, and run through
- the last bit of the last byte read. In other words, the bit
- stream is 'right-adjusted' in the collected bytes, with its
- left end delimited by the first 'on' bit.
-
- For example, to send the bit stream *001010011* (9
- bits), the following bytes are transmitted.
-
- 11110010 00000010 01010011
- ABCDEhij klmnopqr stuvwxyz
-
- The hij=010 value specifies that two bytes follow. The
- q bit, which is the first 1 bit encountered, identifies the
- start of the bit stream as being the r bit. The rstuvwxyz
- bits are the bit stream being handled.
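
The right-adjustment convention can be illustrated with a short Python sketch. Representing a p-BITS as a string of '0' and '1' characters, and the names encode_sbitstr and decode_sbitstr, are assumptions made for this example.

```python
def encode_sbitstr(bits):
    """Encode a bit stream of at most 63 bits as a short bit stream object."""
    if len(bits) > 63 or any(b not in "01" for b in bits):
        raise ValueError("expected at most 63 characters, each '0' or '1'")
    marked = "1" + bits  # the leading 1 delimits the left end
    nbytes = (len(marked) + 7) // 8
    padded = marked.rjust(nbytes * 8, "0")  # right-adjust in the bytes
    payload = int(padded, 2).to_bytes(nbytes, "big")
    return bytes([0b11110000 | (nbytes & 0b111)]) + payload


def decode_sbitstr(data):
    """Recover the bit stream from an object that starts at data[0]."""
    type_byte = data[0]
    if type_byte & 0b11111000 != 0b11110000:
        raise ValueError("bits ABCDE must be 11110")
    nbytes = (type_byte & 0b111) or 8  # xyz=000 means 8 bytes follow
    payload = data[1:1 + nbytes]
    if len(payload) != nbytes:
        raise ValueError("truncated bit stream object")
    all_bits = "".join(format(b, "08b") for b in payload)
    marker = all_bits[:8].find("1")  # only the first data byte is scanned
    if marker < 0:
        raise ValueError("no marker bit in the first data byte")
    return all_bits[marker + 1:]


print(" ".join(format(b, "08b") for b in encode_sbitstr("001010011")))
```

The printed bytes match the three-byte example above for the stream *001010011*.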
-
- =========================
- | Atomic Object: b-BOOL |
- =========================
-
- The b-BOOL (BOOLean) object is used to transmit
- p-BOOLs. The format of b-BOOL objects follows.
-
- ABCDEFG=1111110 specifying the type as
- b-BOOL
- H=z specifying the value
-
- The two possible translations of a b-BOOL are *FALSE*
- and *TRUE*.
-
- 11111100 represents *FALSE*
- 11111101 represents *TRUE*
- ABCDEFGz
-
- If z=0, the value is *FALSE*; otherwise it is *TRUE*.
-
-
-
- ========================================
- | Atomic Object: B-EMPTY |
- ========================================
-
- The b-EMPTY object type is used to transmit a 'null'
- object, i.e. an *EMPTY*. The format of a b-EMPTY follows.
-
- ABCDEFGH=11111110 specifying *EMPTY*
-
- -12-
- =========================
- | Atomic Object: B-XTRA |
- =========================
-
- The b-XTRA objects are used to carry the four possible
- p-XTRA items, i.e., *XTRA0*, *XTRA1*, *XTRA2*, and *XTRA3*.
- These four items correspond to the binary coding of the
- remaining two bits after the b-XTRA type code bits. The
- format of a b-XTRA follows.
-
- ABCDEF=111110 to specify the type b-XTRA
- GH=yz to identify the particular p-XTRA item
- carried
-
- The GH bits of the byte are decoded to produce a
- particular p-XTRA item, as follows.
-
- GH=00 -- *XTRA0*
- GH=01 -- *XTRA1*
- GH=10 -- *XTRA2*
- GH=11 -- *XTRA3*
-
- The b-XTRA object is included to provide the use of
- several single-byte data items to higher levels. These
- items may be assigned by individual applications to improve
- the efficiency of transmission of several very frequent data
- items. For example, the message services protocols will use
- these items to convey positive and negative acknowledgments,
- two very common items in every interaction.
-
- ========================================
- | Atomic Object: B-PADDING |
- ========================================
-
- This object is anomalous, since it really represents no
- data at all. Whenever it is encountered in a byte stream in
- a position where a type-byte is expected, it is completely
- ignored, and the succeeding byte examined instead. Its
- purpose is to serve as a filler in byte streams, providing
- servers with an aid in handling internal problems related to
- their specific word lengths, etc. The encoders may freely
- use this object to serve as padding when necessary.
-
- All b-PADDING data objects exist only within an encoded
- byte stream. They never cause any data item whatsoever to
- be presented externally to the coder module. The format of a
- b-PADDING follows.
-
- ABCDEFGH=11111111
-
- Note that this does not imply that all such 'null'
- bytes in a stream are to be ignored, since they could be
- encountered as a byte within some other type, such as
- b-LINTEGER. Only bytes of this format which, by their
- position in the stream, appear as a 'type' byte are to be
- ignored.
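
A decoder might apply this rule with a small helper such as the one sketched below in Python. The name skip_padding and the use of a byte offset are assumptions of this example; the essential point is that the helper is called only where a type-byte is expected, never in the middle of another object's data.

```python
PADDING = 0b11111111


def skip_padding(data, pos):
    """Return the offset of the next type-byte at or after pos.

    The caller must pass a position at which a type-byte is expected;
    all-ones bytes inside another object's data are not padding.
    """
    while pos < len(data) and data[pos] == PADDING:
        pos += 1
    return pos


stream = bytes([PADDING, PADDING, 0b10001010])
print(skip_padding(stream, 0))  # offset of the b-SINTEGER type-byte
```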
-
- -13-
- VI.4 -- Non-Atomic Objects
-
-
- Non-atomic objects are always transmitted preceded
- by both a single type byte and some small number of size
- byte(s). The type byte identifies that the data object
- concerned is of a non-atomic type, as well as uniquely
- specifying the particular type involved. All non-atomic
- objects have type byte values of the following form.
-
- ABC=110 specifying that the object is
- non-atomic
- DEFGH=vwxyz specifying the particular type
- of object
-
- The vwxyz value is used to specify one of 31 possible
- non-atomic types. The value vwxyz=00000 is reserved for use
- in future expansion.
-
- In all non-atomic data objects, the byte(s) following
- the type-byte specify the number of bytes to follow which
- contain the data object. In all cases, if the number of
- bytes specified are processed, the next byte to be seen
- should be another type-byte, the beginning of the next
- object in the stream.
-
- The number of bytes containing the object size
- information is variable. These bytes will be termed the
- SIZE-BYTES. The first byte encountered has the following
- format.
-
- A=s specifying the manner in which the size
- information is encoded
- BCDEFGH=tuvwxyz specifying the size, or
- number of bytes containing the size
-
- The tuvwxyz values supply a positive binary number. If
- the s value is a one, the tuvwxyz value specifies the number
- of bytes to follow which should be read and concatenated as
- a binary number, which will then specify the size of the
- object. These bytes will appear with high order bits first.
- Thus, if s=1, up to 128 bytes may follow, containing the
- count of the succeeding data bytes, which should certainly
- be sufficient.
-
- Since many non-atomic objects will be fairly short, the
- s=0 condition is used to indicate that the 7 bits contained
- in tuvwxyz specify the actual data byte count. This permits
- objects of sizes up to 128 bytes to be specified using one
- size-information byte. The case tuvwxyz=0000000 is
- interpreted as specifying 128 bytes.
-
- For example, a data object of some non-atomic type
- which requires 100 (144 octal) bytes to be transmitted would
- be sent as follows.
-
- -14-
-
- 110XXXXX -- identifying a specific
- non-atomic object
- 01100100 -- specifying that 100 bytes follow
- .
- .
- data -- the 100 data bytes
- .
- .
-
- Note that the size count does not include the
- size-specifier byte(s) themselves, but does include all
- succeeding bytes in the stream used to encode the object.
-
- A data object requiring 20000 (47040 octal) bytes would
- appear in the stream as follows.
-
- 110XXXXX -- identifying a specific
- non-atomic object
- 10000010 -- specifying that the next 2 bytes
- contain the stream length
- 01001110 -- first byte of number 20000
- 00100000 -- second byte
- .
- .
- data -- 20,000 bytes
- .
- .
-
- Interpretation of the contents of the 20000 bytes in
- the stream can be performed by a module which knows the
- specific format of the non-atomic type specified by DEFGH in
- the type-byte.
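
The size-byte rules can be sketched in Python as follows. The function names are hypothetical. Two points are assumptions of this sketch rather than statements of the protocol: a size of zero is sent in the long form, since the short form reads 0000000 as 128, and a long-form header whose count field is zero is rejected because its meaning is not spelled out above.

```python
def encode_size(n):
    """Encode an object size as size-byte(s)."""
    if n < 0:
        raise ValueError("size cannot be negative")
    if 1 <= n <= 128:
        return bytes([n & 0x7F])  # s=0; a size of 128 is sent as 0000000
    length = max(1, (n.bit_length() + 7) // 8)
    if length > 127:
        raise ValueError("size too large for this sketch")
    return bytes([0x80 | length]) + n.to_bytes(length, "big")  # s=1


def decode_size(data, pos=0):
    """Return (size, offset of the first byte after the size-bytes)."""
    first = data[pos]
    low7 = first & 0x7F
    if first & 0x80 == 0:
        return (low7 or 128), pos + 1
    if low7 == 0:
        raise ValueError("long form with a zero count is not handled here")
    end = pos + 1 + low7
    raw = data[pos + 1:end]
    if len(raw) != low7:
        raise ValueError("truncated size-bytes")
    return int.from_bytes(raw, "big"), end


print(" ".join(format(b, "08b") for b in encode_size(100)))
print(" ".join(format(b, "08b") for b in encode_size(20000)))
```

The two printed lines correspond to the size-bytes of the 100-byte and 20000-byte examples above.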
-
- The remainder of this section defines an initial set of
- non-atomic types, the format of their encoding, and the
- semantics of their interpretation.
-
-
- ================================
- | Non-atomic Object: B-LBITSTR |
- ================================
-
- The b-LBITSTR (Long BIT Stream) data type is introduced
- to transmit p-BITS which cannot be handled by a b-SBITSTR.
- A b-LBITSTR may be used to transmit short p-BITS as well.
- Its format follows.
-
-
-
-
-
-
-
-
-
-
- -15-
-
- 11000001 size-bytes data-bytes
- ABCDEFGH
-
- ABC=110 identifies this as a non-atomic object.
- DEFGH=00001 specifies that it is a b-LBITSTR. The standard
- sizing information specifies the number of succeeding bytes.
- Within the data-bytes, the first object encountered must
- decode to a p-INT. This number conveys the length of the
- bit stream to follow. The actual bit stream begins with the
- next byte, and is left-adjusted in the byte stream. For
- example to encode *101010101010*, the following b-LBITSTR
- could be used, although a b-SBITSTR would be more compact.
-
- 11000001 -- identifies a b-LBITSTR
- 00000011 -- size = 3 (length object plus 2 data bytes)
- 10001100 -- b-SINTEGER, to specify length (12 bits)
- 10101010 -- first 8 data bits
- 10100000 -- last 4 data bits
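
A Python sketch of a b-LBITSTR encoder follows. It assumes that the size-bytes count the length object together with the data bytes, in line with the rule that the size includes all succeeding bytes used to encode the object; under that reading the 12-bit stream *101010101010* needs a size of 3. The helper names and the string representation of a p-BITS are assumptions of this example.

```python
def _encode_int(n):
    """Encode a non-negative p-INT as b-SINTEGER when possible, else b-LINTEGER."""
    if 0 <= n <= 63:
        return bytes([0b10000000 | n])
    length = 1
    while not -(1 << (8 * length - 1)) <= n < (1 << (8 * length - 1)):
        length += 1
    return bytes([0b11100000 | (length & 0b111)]) + n.to_bytes(length, "big", signed=True)


def _encode_size(n):
    """Encode a size of at least one byte as size-byte(s)."""
    if 1 <= n <= 128:
        return bytes([n & 0x7F])
    length = (n.bit_length() + 7) // 8
    return bytes([0x80 | length]) + n.to_bytes(length, "big")


def encode_lbitstr(bits):
    """Encode a string of '0'/'1' characters as a b-LBITSTR."""
    if any(b not in "01" for b in bits):
        raise ValueError("expected only '0' and '1' characters")
    nbytes = (len(bits) + 7) // 8
    padded = bits.ljust(nbytes * 8, "0")  # left-adjusted, zero filled
    data = int(padded, 2).to_bytes(nbytes, "big") if nbytes else b""
    body = _encode_int(len(bits)) + data
    return bytes([0b11000001]) + _encode_size(len(body)) + body


print(" ".join(format(b, "08b") for b in encode_lbitstr("101010101010")))
```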
-
-
-
- ==============================
- | Non-atomic Object: B-STRUC |
- ==============================
-
- The b-STRUC (STRUCture) data type is used to transmit
- any p-STRUC. The translation rules for converting a b-STRUC
- into a primitive item are presented following the discussion
- of b-REPEATs. The b-STRUC format appears as follows.
-
- 11000010 size-bytes data-bytes
- ABCDEFGH
-
- ABC=110 identifies this as a non-atomic type.
- DEFGH=00010 specifies that the object is a b-STRUC. Within
- the data-bytes stream, objects simply follow in order. This
- implies that the b-STRUC encoder and decoder modules can
- simply make use of recursive calls to a standard
- encoder/decoder for processing each element of the b-STRUC.
-
- Note that any type of object is permitted as an element of a
- b-STRUC, but the size information of the b-STRUC must
- include all bytes used to represent the elements.
-
- Containment of b-STRUCs within other b-STRUCs is
- permitted to any reasonable level. That is, a b-STRUC may
- contain as an element another b-STRUC, which contains
- another b-STRUC, and so on. All servers are required to
- handle such containment to at least a minimum depth of
- three.
-
- Examples of encoded structures appear in a later
- section.
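
The recursive structure of a b-STRUC encoder can be sketched in Python as below. To stay short, this sketch handles only p-BOOLs, p-INTs in the range 0..63, 7-bit characters, and nested structures; strings are sent as a b-STRUC of b-CHAR7s. The function names, the Python types chosen to stand for items, and the long-form encoding of a zero size are assumptions of this example.

```python
def _encode_size(n):
    """Size-byte(s) for n succeeding bytes (zero uses the long form here)."""
    if 1 <= n <= 128:
        return bytes([n & 0x7F])
    length = max(1, (n.bit_length() + 7) // 8)
    return bytes([0x80 | length]) + n.to_bytes(length, "big")


def encode_item(item):
    """Encode a small subset of primitive items, recursing into structures."""
    if isinstance(item, bool):  # test bool before int
        return bytes([0b11111100 | int(item)])
    if isinstance(item, int):
        if not 0 <= item <= 63:
            raise ValueError("this sketch encodes b-SINTEGER values only")
        return bytes([0b10000000 | item])
    if isinstance(item, str):
        if any(ord(c) > 0x7F for c in item):
            raise ValueError("this sketch encodes 7-bit characters only")
        if len(item) == 1:
            return bytes([ord(item)])  # treated as a p-CHAR
        item = list(item)  # a p-STRING is a p-STRUC of p-CHARs
    if isinstance(item, (list, tuple)):
        body = b"".join(encode_item(x) for x in item)  # recursive calls
        return bytes([0b11000010]) + _encode_size(len(body)) + body
    raise TypeError("unsupported item: %r" % (item,))


print(encode_item(["AB", 5]).hex())
```

Note that the size of the outer structure covers every byte of the inner one, including the inner type-byte and size-byte.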
-
-
- -16-
- ============================
- | Non-atomic Object: B-EDT |
- ============================
-
- A b-EDT is the object used as the carrier for p-EDTs in
- transmission of semantic items. It is functionally
- identical to a b-STRUC, but has a different type code to
- permit it to be identified and converted to a semantic item
- instead of a p-STRUC. The format of a b-EDT follows.
-
- 11000011 size-bytes data-bytes
- ABCDEFGH
-
- As with all non-atomic types, ABC=110 to identify this
- as such, and DEFGH=00011 to specify a b-EDT. The objects in
- the data-bytes are decoded as for b-STRUCs. However, the
- first object must decode to a p-INT or p-STRING and the
- second to a p-INT, to conform to the format of p-EDTs.
-
-
-
- ===============================
- | Non-atomic Object: b-REPEAT |
- ===============================
-
-
- The b-REPEAT object is never translated directly into
- an item. It is legal only as a component of an enclosing
- b-STRUC, b-USTRUC, b-EDT, or b-REPEAT. A b-REPEAT is used to
- concisely specify a set of elements to be treated as if they
- appeared in the enclosing structure in place of the
- b-REPEAT. This provides a mechanism for encoding a sequence
- of identical data items or patterns efficiently for
- transmission.
-
- A common example of this would be in transmission of
- text, where line images containing long sequences of spaces,
- or pages containing multiple carriage-return, line-feed
- pairs, are often encountered. Such sequences could be
- encoded as an appropriate b-REPEAT to compact the data for
- transmission. The format of a b-REPEAT is as follows.
-
- 11000100 -- identifying the object as a
- b-REPEAT
- size-bytes -- the standard non-atomic object
- size information
- countspec -- an object which translates to a p-INT
- .
- .
- data -- the objects which define the pattern
- .
- .
-
- The 'countspec' object must translate to a p-INT to
- specify the number of times that the following data pattern
- should be repeated in the object enclosing the b-REPEAT.
-
- The remaining objects in the b-REPEAT constitute the
- data pattern which is to be repeated. The decoding of the
- enclosing structure will be continued as if the data pattern
- objects appeared 'countspec' times in place of the b-REPEAT.
- Zero repeat counts are permitted, for generality. They
- cause no objects to be simulated in the enclosing structure.
-
- An encoder does not have to use b-REPEATs at all, if
- simplicity of coding outweighs the benefits of data
- compression. In message services, for example, an encoder
- might limit itself to only compressing long text strings. It
- is important for compatibility, however, to have the ability
- in the decoders to handle b-REPEATs.
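-
- A decoder can handle b-REPEATs by splicing the repeated
- pattern into the list of items it is building for the
- enclosing structure. The following Python sketch shows one
- way to do so. It assumes a single size byte with a value
- below 128 and a 6-bit b-SINTEGER ('10' prefix), and it
- supports only b-CHAR7, b-SINTEGER, b-STRUC, and b-REPEAT.
- The function name decode_elements is hypothetical.
-
```python
B_STRUC = 0b11000010
B_REPEAT = 0b11000100


def decode_elements(buf, pos, end):
    """Decode the objects in buf[pos:end] into a flat list of items."""
    items = []
    while pos < end:
        type_byte = buf[pos]
        if (type_byte & 0b10000000) == 0:
            items.append(chr(type_byte))            # b-CHAR7
            pos += 1
        elif (type_byte & 0b11000000) == 0b10000000:
            items.append(type_byte & 0b00111111)    # b-SINTEGER (assumed 6-bit)
            pos += 1
        elif type_byte in (B_STRUC, B_REPEAT):
            size = buf[pos + 1]                     # assumed single size byte
            body_end = pos + 2 + size
            if body_end > end:
                raise ValueError("object extends past its enclosing structure")
            body = decode_elements(buf, pos + 2, body_end)
            if type_byte == B_STRUC:
                items.append(body)                  # one nested structure
            else:
                if not body or type(body[0]) is not int:
                    raise ValueError("countspec must translate to a p-INT")
                count, pattern = body[0], body[1:]
                # Splice the pattern in place of the b-REPEAT itself;
                # a count of zero contributes nothing.
                items.extend(pattern * count)
            pos = body_end
        else:
            raise ValueError(f"unsupported type-byte {type_byte:08b}")
    return items


# A b-STRUC holding a b-REPEAT of 20 carriage-return, line-feed pairs.
stream = bytes([B_STRUC, 5, B_REPEAT, 3, 0b10010100, 0x0D, 0x0A])
decoded = decode_elements(stream, 0, len(stream))
print(len(decoded[0]))
```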
-
- ===============================
- | Non-atomic Object: b-USTRUC |
- ===============================
-
- The b-USTRUC (Uniform Structure) object type is
- provided to enable servers to convey the fact that a p-STRUC
- being transferred contains items of only a single type. The
- most common example would involve a b-USTRUC which
- translates to a p-STRUC of only p-CHARs, and hence may be
- considered to be a p-STRING. Servers may use this
- information to assist them in decoding objects efficiently.
- No server is required to generate b-USTRUCs.
-
- The internal construction of a b-USTRUC is identical to
- that of a b-STRUC, except for the type-byte. The format of a
- b-USTRUC follows.
-
- 11000101 size-bytes data-bytes
- ABCDEFGH
-
- ABC=110 to identify a non-atomic object. DEFGH=00101
- specifies the object as a b-USTRUC.
-
- ===============================
- | Non-atomic Object: b-STRING |
- ===============================
-
- The b-STRING object is included to permit explicit
- specification of a structure as a p-STRING. This
- information will permit receiving servers to process the
- incoming structure more efficiently. A b-STRING is
- formatted similarly to a b-USTRUC, except that its type-byte
- identifies the object as a b-STRING. The normal sizing
- information is followed by a stream of bytes which are
- interpreted as b-CHAR7s, ignoring the high-order bit. The
- format of a b-STRING follows.
-
- 11000110 size-bytes data-bytes
- ABCDEFGH
-
- ABC=110 to identify a non-atomic object. DEFGH=00110
- specifies the object as a b-STRING.
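-
- Decoding a b-STRING therefore reduces to masking off the
- high-order bit of each data byte. A minimal Python sketch
- follows, assuming a single size byte with a value below
- 128. The function name decode_b_string is hypothetical.
-
```python
B_STRING = 0b11000110  # type-byte for b-STRING, as given above


def decode_b_string(buf, pos=0):
    """Decode a b-STRING at buf[pos]; return (text, next_pos)."""
    if buf[pos] != B_STRING:
        raise ValueError("not a b-STRING")
    size = buf[pos + 1]  # assumed: one size byte, value < 128
    start = pos + 2
    data = buf[start:start + size]
    if len(data) != size:
        raise ValueError("b-STRING is truncated")
    # Each byte is read as a b-CHAR7; the high-order bit is ignored.
    text = "".join(chr(b & 0b01111111) for b in data)
    return text, start + size


print(decode_b_string(bytes([B_STRING, 5]) + b"HELLO"))
```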
-
- VI.5 -- Structure Translation Rules
-
-
- A b-STRUC is translated into a p-STRUC. This is
- performed by translating each object of the b-STRUC into its
- corresponding item, and saving it for inclusion in the
- p-STRUC being generated. A b-USTRUC is handled similarly,
- but the coding programs may utilize the information that the
- resultant p-STRUC will contain items of uniform type. The
- preferred method of coding p-STRINGs is to use b-USTRUCs.
-
- If all of the elements of the resultant p-STRUC are
- p-CHARs, it is presented to the user of the decoder as a
- p-STRING. A p-STRING should be considered to be a synonym
- for a p-STRUC containing only characters. It need not
- exist as a distinct type at every site; some sites may
- simply present p-STRUCs of p-CHARs to their application
- programs.
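-
- The presentation rule can be sketched as a small
- post-processing step. The Python fragment below assumes that
- p-CHARs are held as one-character strs and that a p-STRUC is
- a list. Treating an empty p-STRUC as a structure rather than
- as an empty string is an assumption of this sketch, and the
- name present_p_struc is hypothetical.
-
```python
def present_p_struc(items):
    """Return a str when every element is a single character, else a list."""
    def is_char(x):
        return isinstance(x, str) and len(x) == 1

    if items and all(is_char(x) for x in items):
        return "".join(items)   # presented as a p-STRING
    return list(items)          # presented as an ordinary p-STRUC


print(present_p_struc(["H", "E", "L", "L", "O"]))
print(present_p_struc(["X", "Y", 10]))
```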
-
- The object b-REPEAT is handled in a special fashion
- when encountered as an element. When this occurs, the data
- pattern of the b-REPEAT is translated into a sequence of
- items, and that sequence is repeated in the next higher
- level as many times as specified in the b-REPEAT.
- Therefore, b-REPEATs are legal only as elements of a
- surrounding b-STRUC, b-USTRUC, b-EDT, or b-REPEAT.
-
- In encoding a p-STRUC or p-STRING for transmission, a
- translator may use b-REPEATs as desired to effect data
- compression, but their use is not mandatory. Similarly,
- b-STRINGs may be used, but are not mandatory.
-
- A b-EDT is translated into a p-EDT to identify it as a
- carrier for a semantic item. Otherwise, it is treated
- identically to a b-STRUC.
-
-
- VI.6 -- Translation Summary
-
-
- The following table summarizes the possible
- translations between primitive items and objects.
-
- p-INT    <--> b-LINTEGER, b-SINTEGER
- p-STRING <--> b-STRING, b-STRUC, b-USTRUC
- p-STRUC  <--> b-STRING, b-STRUC, b-USTRUC
- p-BITS   <--> b-SBITSTR, b-LBITSTR
- p-CHAR   <--> b-CHAR7
- p-BOOL   <--> b-BOOL
- p-EMPTY  <--> b-EMPTY
- p-XTRA   <--> b-XTRA
- p-EDT    <--> b-EDT (all semantic items)
- -none-   <--> b-PADDING
- -none-   <--> b-REPEAT (only within structure)
-
- Note that all semantic items are represented as p-EDTs
- which always exist as b-EDTs in byte-stream format.
-
- VI.7 -- Structure Coding Examples
-
-
- The following stream transmits a b-STRUC containing 3
- b-SINTEGERs, with values 1, 2, and 3, representing a p-STRUC
- containing three p-INTs, i.e. (1 2 3).
-
- 11000010 -- b-STRUC
- 00000011 -- size=3
- 10000001 -- b-SINTEGER=1
- 10000010 -- b-SINTEGER=2
- 10000011 -- b-SINTEGER=3
-
- The next example represents a b-STRUC containing the
- characters X and Y, followed by the b-LINTEGER 10,
- representing a p-STRUC of 2 p-CHARs and a p-INT, i.e., ('X'
- 'Y' 10). Note that the p-INT prevents considering this a
- p-STRING.
-
- 11000010 -- b-STRUC
- 00000100 -- size=4
- 01011000 -- b-CHAR7 'X'
- 01011001 -- b-CHAR7 'Y'
- 11100001 -- b-LINTEGER
- 00001010 -- 10
-
- Note that a better way to send this p-STRUC would be to
- represent the integer as a b-SINTEGER, as shown below.
-
- 11000010 -- b-STRUC
- 00000011 -- size=3
- 01011000 -- b-CHAR7 'X'
- 01011001 -- b-CHAR7 'Y'
- 10001010 -- b-SINTEGER=10
-
- The next example shows a b-STRUC of b-CHAR7s. It is
- the translation of the p-STRING "HELLO".
-
- 11000010 -- b-STRUC
- 00000101 -- size=5
- 01001000 -- b-CHAR7 'H'
- 01000101 -- b-CHAR7 'E'
- 01001100 -- b-CHAR7 'L'
- 01001100 -- b-CHAR7 'L'
- 01001111 -- b-CHAR7 'O'
-
- This datum could also be transmitted as a b-STRING.
- Note that the character bytes are not necessarily b-CHAR7s,
- since the high-order bit is ignored.
-
- 11000110 -- b-STRING
- 00000101 -- size=5
- 01001000 -- 'H'
- 01000101 -- 'E'
- 01001100 -- 'L'
- 01001100 -- 'L'
- 01001111 -- 'O'
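-
- For 7-bit text, the two encodings above differ only in the
- type-byte. The Python sketch below builds both and prints
- them in the bit notation used in this section. It assumes a
- single size byte with a value below 128. The names
- encode_text and as_bits are hypothetical.
-
```python
B_STRUC = 0b11000010
B_STRING = 0b11000110


def encode_text(text, type_byte):
    """Encode 7-bit text as a b-STRUC of b-CHAR7s or as a b-STRING."""
    data = text.encode("ascii")  # rejects anything outside 7-bit ASCII
    if len(data) > 127:
        raise ValueError("sketch supports only a single size byte")
    return bytes([type_byte, len(data)]) + data


def as_bits(encoded):
    """Render each byte as an 8-character binary string."""
    return [format(b, "08b") for b in encoded]


for line in as_bits(encode_text("HELLO", B_STRING)):
    print(line)
```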
-
- To encode a p-STRING containing 20 carriage-return
- line-feed pairs, the following b-STRUC containing a b-REPEAT
- could be used.
-
- 11000010 -- b-STRUC
- 00000101 -- size=5
- 11000100 -- b-REPEAT
- 00000011 -- size=3
- 10010100 -- count, b-SINTEGER=20
- 00001101 -- b-CHAR7, 'CR'
- 00001010 -- b-CHAR7, 'LF'
-
- To encode a p-STRUC of p-INTs consisting of a single 1
- followed by a sequence of thirty 0's, the following b-STRUC
- could be used.
-
- 11000010 -- b-STRUC
- 00000101 -- size=5
- 10000001 -- b-SINTEGER=1
- 11000100 -- b-REPEAT
- 00000010 -- size=2
- 10011110 -- count, b-SINTEGER=30
- 10000000 -- b-SINTEGER=0
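-
- An encoder that chooses to use b-REPEATs only has to wrap an
- already-encoded pattern with a type-byte, a size, and a
- count. The Python sketch below reproduces the
- carriage-return, line-feed example. It assumes a single size
- byte with a value below 128 and a count that fits a 6-bit
- b-SINTEGER. The names encode_repeat and encode_struc are
- hypothetical.
-
```python
B_STRUC = 0b11000010
B_REPEAT = 0b11000100


def encode_repeat(count, pattern):
    """Wrap encoded pattern bytes in a b-REPEAT repeated `count` times."""
    if not 0 <= count <= 63:
        raise ValueError("sketch supports only counts that fit a b-SINTEGER")
    body = bytes([0b10000000 | count]) + pattern   # countspec, then pattern
    if len(body) > 127:
        raise ValueError("sketch supports only a single size byte")
    return bytes([B_REPEAT, len(body)]) + body


def encode_struc(elements):
    """Wrap a list of already-encoded objects in a b-STRUC."""
    body = b"".join(elements)
    if len(body) > 127:
        raise ValueError("sketch supports only a single size byte")
    # The size covers every byte of the elements, nested headers included.
    return bytes([B_STRUC, len(body)]) + body


crlf20 = encode_struc([encode_repeat(20, b"\r\n")])
print([format(b, "08b") for b in crlf20])
```
-
- The compressed form occupies 7 bytes, where a b-STRUC
- holding the 40 characters literally would occupy 42.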
-
-
- VII. A GENERAL DATA TRANSFER SCHEME
-
-
- This section considers a possible scheme for extending
- the concept of a data translator into a multi-purpose data
- transfer mechanism.
-
- The proposed environment would provide a set of
- primitive items, including those enumerated herein but
- extended as necessary to accommodate a variety of
- applications. Communication between processes would be
- defined solely in terms of these items, and would
- specifically avoid any consideration of the actual formats
- in which the data is transferred.
-
- A repertoire of translators would be provided, one of
- which is the MSDTP machinery, for use in converting items to
- any of a number of transmission formats. Borrowing a
- concept from radio terminology, each translator would be
- analogous to a different type of modulation scheme, to be
- used to transfer data through some communications medium.
- Such media could be an eight-bit byte-oriented connection,
- a 36-bit connection, etc., and could conceivably have other
- distinguishing features, such as bandwidth, cost, and delay.
- For each medium which a site supports, it would provide its
- programmers with a module for performing the translations
- required.
-
- Certain media or translators might not handle various
- items. For example, the MSDTP does not handle items which
- might be termed p-FLOATs, p-COMPLEXs, p-ARRAYs, and so on. In
- addition, the efficiency of various media for transfer of
- specific items may differ drastically. MSDTP, for example,
- transfers data frequently used in message handling very
- efficiently, but is relatively poor at transfer of very
- large or deep tree structures.
-
- Available at each site as a process or subroutine
- package would be a module responsible for interfacing with
- its counterpart at the other end of the medium. These
- modules would use a protocol, not yet defined, to match
- their capabilities, and choose a particular medium and
- translator, when more than one exists, for transfer of data
- items.
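-
- Since the matching protocol is not yet defined, the
- following Python sketch is purely hypothetical. It shows one
- simple way two modules could compare capabilities: each side
- lists the (medium, translator) pairs it supports together
- with the item types it can handle through each pair, and the
- first pair in local preference order that both sides support
- for all needed items is chosen. All names and data shapes
- here are assumptions.
-
```python
def choose_transfer(local, remote, needed_items):
    """Pick a (medium, translator) pair both sites support for the items.

    `local` and `remote` map (medium, translator) pairs to the set of
    primitive item types each side can handle through that pair.
    Returns the chosen pair, or None when no common pair suffices.
    """
    for pair, local_items in local.items():   # local preference order
        remote_items = remote.get(pair)
        if remote_items is None:
            continue                          # remote site lacks this pair
        if needed_items <= (local_items & remote_items):
            return pair
    return None


local = {
    ("36-bit connection", "hypothetical-translator"): {"p-INT", "p-FLOAT"},
    ("8-bit connection", "MSDTP"): {"p-INT", "p-STRING", "p-STRUC"},
}
remote = {
    ("8-bit connection", "MSDTP"): {"p-INT", "p-STRING", "p-STRUC", "p-BOOL"},
}
print(choose_transfer(local, remote, {"p-INT", "p-STRING"}))
```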
-
- Such a facility could totally insulate applications
- from the need to consider encoding formats, machine differences,
- and so on, as well as eliminate duplication of effort in
- producing such facilities for every new project which
- requires them. In addition, as new translators or media are
- introduced, they would become immediately available to
- existing users without reprogramming.
-
- Implementation of such a protocol should not be very
- difficult or time-consuming, since it need not be very
- sophisticated in choosing the most appropriate transfer
- mechanism in initial implementations. The system is
- inherently upward-compatible and easily expandable.
-